Causal Inference

Causal inference in machine learning is the process of identifying causal relationships between variables in a dataset, often using techniques such as structural equation modeling (SEM), directed acyclic graphs (DAGs), and counterfactual reasoning. The goal is to estimate the causal effects of interventions or treatments on outcomes, which supports predicting how outcomes respond to an action and making decisions under uncertainty.
See:

Uplift modelling

Causality

Resources

Free Causal Inference Resources
How big tech companies use Machine Learning and Causal Inference to make data-driven decisions | by Haytham Cheikhrouhou | Medium
From Meaningful Data Science to Impactful Decisions: The Importance of Being Causally Prescriptive
Six Causal Inference Techniques Using Python
XAI Stories (pbiecek.github.io)
- Chapter 5 Story Uplift Modeling: eXplainable predictions for optimized marketing campaigns | XAI Stories (pbiecek.github.io)
ATE vs CATE vs ATT vs ATC for Causal Inference | by Amy @GrabNGoInfo | GrabNGoInfo | Medium
Average Treatment Effects ATE vs CATE vs ATT vs ATC | Causal Inference
Applications of Causal Inference for Marketing: Estimating Treatment Effects for multiple Treatment
Causal Inference in A/B Testing: Navigating True Experimental Setups | by Jagadeesanmuthuvel | Medium
Demystifying Causality: An Introduction in Causal Inference and Applications. Part 4. | by IvanGor | Medium
Demystifying the Applications of Causal Inference in the Industry | by Olesia Badashova | Medium
Causal Inference with Continuous Treatments | by Ehud Karavani | Towards Data Science
Why Data Scientists Should Learn Causal Inference | by Leihua Ye, PhD | Medium

Double ML

A method for estimating (heterogeneous) treatment effects when all potential confounders/controls (factors that directly affected both the treatment decision in the collected data and the observed outcome) are observed, but are either too numerous (high-dimensional) for classical statistical approaches to apply, or have an effect on the treatment and outcome that cannot be modeled satisfactorily by parametric functions (non-parametric).
Orthogonal/Double Machine Learning — econml 0.15.1 documentation
Steps:
1. Splits the data into folds for cross-fitting.
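2. Models two nuisance components, each fit on the covariates only:
  1. Propensity score (treatment model): e(X) = P(T=1|X).
  2. Outcome regression: m(X) = E[Y|X]. The treatment is left out of this model; otherwise the residual would no longer contain the treatment effect.
3. Computes residuals to remove the effects of covariates:
  1. Outcome residual: Y~ = Y - m(X).
  2. Treatment residual: T~ = T - e(X).
4. Estimates the treatment effect by regressing Y~ on T~.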
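
The steps above can be sketched on synthetic data as follows. This is a minimal illustration, not the EconML implementation: the data-generating process, the constant effect `true_effect`, and the choice of scikit-learn's `LogisticRegression` and `LinearRegression` as nuisance models are all assumptions made for the example. The outcome model is fit on the covariates alone (E[Y|X]), as in the partialling-out form of Double ML.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
n, true_effect = 2000, 2.0  # assumed sample size and constant treatment effect

# Synthetic data (assumption): X[:, 0] drives both treatment and outcome.
X = rng.normal(size=(n, 5))
propensity = 1.0 / (1.0 + np.exp(-X[:, 0]))
T = rng.binomial(1, propensity)
Y = true_effect * T + X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)

y_res = np.empty(n)
t_res = np.empty(n)

# Step 1: split into folds; each fold is predicted by models fit on the others.
folds = KFold(n_splits=5, shuffle=True, random_state=0)
for train, test in folds.split(X):
    # Step 2: nuisance models for the propensity e(X) and the outcome E[Y|X].
    e_hat = LogisticRegression().fit(X[train], T[train]).predict_proba(X[test])[:, 1]
    m_hat = LinearRegression().fit(X[train], Y[train]).predict(X[test])
    # Step 3: out-of-fold residuals.
    t_res[test] = T[test] - e_hat
    y_res[test] = Y[test] - m_hat

# Step 4: regress the outcome residual on the treatment residual (no intercept).
theta_hat = float(np.sum(t_res * y_res) / np.sum(t_res ** 2))
print(f"estimated effect: {theta_hat:.2f} (simulated truth: {true_effect})")
```

In practice the nuisance models would usually be more flexible learners (for example gradient boosting), which is where the high-dimensional or non-parametric setting described above comes in.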

Code

#CODE Causalai: Salesforce CausalAI Library - A Fast and Scalable framework for Causal Analysis of Time Series and Tabular Data
- Causal Analysis of Time Series and Tabular Data
#CODE EconML - Python package for estimating heterogeneous treatment effects from observational data via machine learning
- https://www.pywhy.org/
- Machine Learning Based Estimation of Heterogeneous Treatment Effects — econml 0.15.1 documentation
- EconML/notebooks at main · py-why/EconML · GitHub
- Causal inference (Part 2 of 3): Selecting algorithms | by Jane Huang | Data Science at Microsoft | Medium
- Part of the ALICE (Automated Learning and Intelligence for Causation and Economics) Microsoft Research project
#CODE CausalPy - A Python package for causal inference in quasi-experimental settings
- CausalPy - causal inference for quasi-experiments — CausalPy 0.0.14 documentation
#CODE Chirho - An experimental language for causal reasoning
- Causal Reasoning with ChiRho — chirho documentation (basisresearch.github.io)
- Causal probabilistic programming without tears — chirho documentation (basisresearch.github.io)
#CODE kochbj/Deep-Learning-for-Causal-Inference - Tutorials for learning how to build deep learning models for causal inference (HTE) using selection on observables in TF 2
#CODE GitHub - SUwonglab/CausalEGM: A General Causal Inference Framework by Encoding Generative Modeling
- CausalEGM main applications: estimate average treatment effect (ATE), estimate individual treatment effect (ITE), estimate average dose response function (ADRF), estimate conditional average treatment effect (CATE), built-in simulation and semi-simulation datasets.
- Tutorial for Python Users — CausalEGM documentation

Course

Talks

#TALK What is causal inference, and why should data scientists know? by Ludvig Hult - YouTube

References

#PAPER Adapting Neural Networks for the Estimation of Treatment Effects (2019)
#PAPER Causal Inference for Banking Finance and Insurance A Survey (2023)
#PAPER CausalEGM: a general causal inference framework by encoding generative modeling (2023)
#PAPER DAG-aware Transformer for Causal Effect Estimation (2024)
#PAPER Advancing Explainable AI with Causal Analysis in Large-Scale Fuzzy Cognitive Maps (2024)
#PAPER From prediction to prescription: Machine learning and Causal Inference (2024)
- Traditional ML models excel at making accurate predictions but often fall short in guiding interventions because they don't inherently account for causal relationships.
- Combining ML with causal inference techniques can bridge this gap, enabling models to not only predict outcomes but also prescribe actionable strategies.
- By leveraging causal inference, ML models can identify which variables are truly influential and determine the potential impact of changing them, leading to more effective and reliable prescriptions.
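
The gap between prediction and prescription noted in the bullets above can be illustrated with a small simulation. This sketch is not taken from the paper: the confounder `Z`, the effect size `true_effect`, and the linear data-generating process are assumptions chosen for the example. A purely associational comparison of treated and untreated units overstates the effect, while adjusting for the confounder recovers it.

```python
import numpy as np

rng = np.random.default_rng(1)
n, true_effect = 5000, 1.0  # assumed sample size and true causal effect

# Z is a confounder (assumption): it raises both the chance of treatment
# and the outcome itself.
Z = rng.normal(size=n)
T = (Z + rng.normal(size=n) > 0).astype(float)
Y = true_effect * T + 2.0 * Z + rng.normal(size=n)

# Associational view: difference in average outcome, treated vs untreated.
naive = float(Y[T == 1].mean() - Y[T == 0].mean())

# Causal view: ordinary least squares of Y on T while adjusting for Z.
design = np.column_stack([np.ones(n), T, Z])
coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
adjusted = float(coef[1])

print(f"naive difference: {naive:.2f}")
print(f"adjusted estimate: {adjusted:.2f} (simulated truth: {true_effect})")
```

The adjustment works here only because the single confounder is observed and the outcome is linear by construction; real applications need the identification assumptions to be argued, not assumed.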
#PAPER Causal machine learning for predicting treatment outcomes (2024)